A Multivariate Discretization Method for Learning Bayesian Networks from Mixed Data
نویسندگان
چکیده
In this paper we address the problem of discretization in the context of learning Bayesian networks (BNs) from data con taining both continuous and discrete vari ables. We describe a new technique for multivariate discretization, whereby each continuous variable is discretized while tak ing into account its interaction with the other variables. The technique is based on the use of a Bayesian scoring metric that scores the discretization policy for a con tinuous variable given a BN structure and the observed data. Since the metric is rel ative to the BN structure currently being evaluated, the discretization of a variable needs to be dynamically adjusted as the BN structure changes.
منابع مشابه
Augmented Naive Bayesian Classifiers for Mixed-Mode Data
Conventional Bayesian networks often require discretization of continuous variables prior to learning. It is important to investigate Bayesian networks allowing mixed-mode data, in order to better represent data distributions as well as to avoid the overfitting problem. However, this attempt imposes potential restrictions to a network construction algorithm, since certain dependency has not bee...
متن کاملDiscretizing Continuous Attributes While Learning Bayesian Networks
We introduce a method for learning Bayesian networks that handles the discretization of continuous variables as an integral part of the learning process. The main ingredient in this method is a new metric based on the Minimal Description Length principle for choosing the threshold values for the discretization while learning the Bayesian network structure. This score balances the complexity of ...
متن کاملMultivariate Cluster-Based Discretization for Bayesian Network Structure Learning
While there exist many efficient algorithms in the literature for learning Bayesian networks with discrete random variables, learning when some variables are discrete and others are continuous is still an issue. A common way to tackle this problem is to preprocess datasets by first discretizing continuous variables and, then, resorting to classical discrete variable-based learning algorithms. H...
متن کاملLearning Discrete Bayesian Networks from Continuous Data
Real data often contains a mixture of discrete and continuous variables, but many Bayesian network structure learning and inference algorithms assume all random variables are discrete. Continuous variables are often discretized, but the choice of discretization policy has significant impact on the accuracy, speed, and interpretability of the resulting models. This paper introduces a principled ...
متن کاملA Surface Water Evaporation Estimation Model Using Bayesian Belief Networks with an Application to the Persian Gulf
Evaporation phenomena is a effective climate component on water resources management and has special importance in agriculture. In this paper, Bayesian belief networks (BBNs) as a non-linear modeling technique provide an evaporation estimation method under uncertainty. As a case study, we estimated the surface water evaporation of the Persian Gulf and worked with a dataset of observations ...
متن کامل